HASH-BASED SYSTEMS AND METHODS FOR DETECTING AND PREVENTING TRANSMISSION OF UNWANTED E-MAIL
[[HASH-BASED SYSTEMS AND METHODS FOR DETECTING, PREVENTING, AND TRACING NETWORK]]

BACKGROUND OF THE INVENTION 
Field of the Invention 

The present invention relates generally to network security and, more particularly, to systems
and methods for detecting and/or preventing the transmission of unwanted e-mails [[malicious
packets]], such as e-mails containing worms and viruses, including polymorphic worms and
viruses.

Description of Related Art

Availability of low cost computers, high speed networking products, and readily available
network connections has helped fuel the proliferation of the Internet. This proliferation has
caused the Internet to become an essential tool for both the business community and private
individuals. Dependence on the Internet arises, in part, because the Internet makes it possible for
multitudes of users to access vast amounts of information and perform remote transactions
expeditiously and efficiently. Along with the rapid growth of the Internet have come problems
arising from attacks from within the network and the sheer volume of commercial e-mail. As the
size of the Internet continues to grow, so does the threat posed to users of the Internet.

[0001] Many of the problems take the form of e-mail. Viruses and worms often masquerade
within e-mail messages for execution by unsuspecting e-mail recipients. Unsolicited commercial
e-mail, or "spam," is another burdensome type of e-mail because it wastes both the time and
resources of the e-mail recipient.

[0002] Existing techniques for detecting viruses, worms, and spam examine each e-mail
message individually. In the case of viruses and worms, this typically means examining
attachments for byte-strings found in known viruses and worms (possibly after uncompressing or
de-archiving attached files), or simulating execution of the attachment in a "safe" compartment
and examining its behavior. Similarly, existing spam filters usually examine a single e-mail
message looking for heuristic traits commonly found in unsolicited commercial e-mail, such as
an abundance of Uniform Resource Locators (URLs), heavy use of all-capital-letter words, use
of colored text or large fonts, and the like, and then "score" the message based on the number
and types of such traits found. Both techniques can demand significant processing of each
message, adding to the resource burden imposed by unwanted e-mail. Neither technique makes
use of information collected from other recent messages.
[0003] Thus, there is a need for an efficient technique that can quickly detect viruses, worms,
and spam in e-mail messages arriving at e-mail servers, possibly by using information contained
in multiple recent messages to detect unwanted mail more quickly and efficiently.

[[these individuals. The ever-increasing number of computers, routers, and connections making
up the Internet ... hosts or computers connected to the network. In fact, each router, switch, or
computer connected to the Internet may be a potential entry point from which a malicious
individual can launch an attack while remaining largely undetected. Attacks carried out on the
Internet often ... switch can be compromised and configured to place malicious packets onto the
network. ... by overloading the network, or damage target computers (e.g., by deleting files). ...
A worm, on the other hand, is a program that can make copies of itself and spread itself through
connected systems, using up resources in affected computers or causing other damage.

... wasted millions of man-hours in clean-up operations in corporations and homes all over the
world. Famous examples include the "Melissa" e-mail virus and the "Code Red" worm.

Various defenses, such as e-mail filters, anti-virus programs, and firewall mechanisms, have
been employed against viruses and worms, but with limited success. The defenses often rely on
computer-based recognition ... propagation mechanism ... developed viruses and worms. There is
also a need to trace the path taken by a virus or worm.]]

SUMMARY OF THE INVENTION

[0004] Systems and methods consistent with the present invention address this and other needs
by providing a new defense that detects and prevents the transmission of unwanted (and
potentially unwanted) e-mail, such as e-mails containing viruses, worms, and spam.



In re U.S. 10/654,771
Changes made to 10/251,403 to create CIP app 10/654,771

[[attacks malicious packets, such as viruses and worms, at their most common denominator ...
systems, without ... the place at which it was initially injected into the network).]]

[0005] In accordance with [[an aspect]] the principles of the invention as embodied and broadly
described herein, a method for detecting [[system detects the]] transmission of potentially
unwanted e-mail messages is provided. The method may include receiving e-mail messages and
generating hash values based on one or more portions of the e-mail messages. The method may
further include determining whether the generated hash values match hash values associated with
prior e-mail messages. The method may also include determining that one of the e-mail messages
is a potentially unwanted e-mail message when one or more of the generated hash values
associated with the e-mail message match one or more of the hash values associated with the
prior e-mail messages.

[0006] In accordance with another aspect of the invention, a mail server includes one or
more hash memories and a hash processor. The one or more hash memories are configured to
store count values associated with hash values. The hash processor is configured to receive an
e-mail message, hash one or more portions of the received e-mail message to generate hash
values, and increment the count values corresponding to the generated hash values. The hash
processor is further configured to determine whether the e-mail message is a potentially
unwanted e-mail message based on the incremented count values.

[0007] In accordance with yet another aspect of the invention, a method for detecting
transmission of unwanted e-mail messages is provided. The method includes receiving e-mail
messages and detecting unwanted e-mail messages of the received e-mail messages based on
hashes of previously received e-mail messages, where multiple hashes are performed on each of
the e-mail messages.

[0008] In accordance with a further aspect of the invention, a method for detecting
transmission of potentially unwanted e-mail messages is provided. The method includes
receiving an e-mail message; generating hash values from blocks of the e-mail message, where
the blocks include at least two of a main text portion, an attachment portion, and a header portion
of the e-mail message; determining whether the generated hash values match hash values
associated with prior e-mail messages; and determining that the e-mail message is a potentially
unwanted e-mail message when one or more of the generated hash values associated with the
e-mail message match one or more of the hash values associated with the prior e-mail messages.

[0009] In accordance with another aspect of the invention, a mail server in a network of
cooperating mail servers is provided. The mail server includes one or more hash memories and a
hash processor. The one or more hash memories are configured to store information relating
to hash values corresponding to previously-observed e-mails. The hash processor is configured
to receive at least some of the hash values from another one or more of the cooperating mail
servers and store information relating to the at least some of the hash values in at least one of the
one or more hash memories. The hash processor is further configured to receive an e-mail
message, hash one or more portions of the received e-mail message to generate hash values,
determine whether the generated hash values match the hash values corresponding to
previously-observed e-mails, and identify the received e-mail message as a potentially unwanted
e-mail message when one or more of the generated hash values associated with the received
e-mail message match one or more of the hash values corresponding to previously-observed
e-mails.
[0010] In accordance with yet another aspect of the invention, a mail server is provided. The
mail server includes one or more hash memories and a hash processor. The one or more hash
memories are configured to store count values associated with hash values. The hash
processor is configured to receive e-mail messages, hash one or more portions of the received
e-mail messages to generate hash values, increment the count values corresponding to the
generated hash values, as incremented count values, and generate suspicion scores for the
received e-mail messages based on the incremented count values.
[0011] In accordance with a further aspect of the invention, a method for preventing
transmission of unwanted e-mail messages is provided. The method includes receiving an e-mail
message; incrementally generating hash values corresponding to the e-mail message as the
e-mail message is being received; and incrementally determining whether the generated hash
values match hash values associated with prior e-mail messages. The method further includes
generating a suspicion score for the e-mail message based on the incremental determining; and
rejecting the e-mail message when the suspicion score of the e-mail message is above a threshold.
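
By way of illustration only, the incremental method of the preceding paragraph might be sketched as follows. This is not the claimed implementation. The 64-byte block size, the use of CRC-32 as the hash, the one-point-per-match scoring rule, and the names `incremental_block_hashes` and `should_reject` are all assumptions made for the example.

```python
import zlib
from typing import Iterable, Iterator, Set

BLOCK_SIZE = 64  # assumed block size, chosen only for this sketch


def incremental_block_hashes(chunks: Iterable[bytes]) -> Iterator[int]:
    """Yield a CRC-32 value for each block as soon as enough bytes have arrived."""
    pending = b""
    for chunk in chunks:
        pending += chunk
        while len(pending) >= BLOCK_SIZE:
            block, pending = pending[:BLOCK_SIZE], pending[BLOCK_SIZE:]
            yield zlib.crc32(block)
    if pending:  # hash the final, shorter block
        yield zlib.crc32(pending)


def should_reject(chunks: Iterable[bytes], prior_hashes: Set[int], threshold: int) -> bool:
    """Return True as soon as the running suspicion score rises above the threshold."""
    score = 0
    for value in incremental_block_hashes(chunks):
        if value in prior_hashes:
            score += 1  # assumed scoring rule: one point per matching block
            if score > threshold:
                return True  # reject without reading the rest of the message
    return False
```

Because the score is updated block by block, a server following this sketch could refuse a message before the transfer completes.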

[0012] [[prior packets. The system may determine that one of the packets is a potentially
malicious packet ...

According to another implementation consistent with the present invention, a system for
hampering transmission of a potentially malicious packet is disclosed. The system includes
means for receiving a packet; means for generating one or more hash values from the packet;
... when the packet is determined to be a potentially malicious packet.

According to yet another implementation consistent with the present invention, a method for
... storing hash values ... potentially malicious packet was one of the received packets when one
or more of the generated ...]]
BRIEF DESCRIPTION OF THE DRAWINGS 

The accompanying drawings, which are incorporated in and constitute a part of this
specification, illustrate the invention and, together with the description, explain the invention. In
the drawings,

FIG. 1 is a diagram of a system in which systems and methods consistent with the present
invention may be implemented;

FIG. 2 is an exemplary diagram of the e-mail [[a security]] server of FIG. 1 according to an
implementation consistent with the principles of the invention;

FIG. 3 is an exemplary functional diagram of the e-mail server of Fig. 2 [[packet detection
logic]] according to an implementation consistent with the principles of the invention;
[0013] Fig. 4 is an exemplary diagram of the hash processing block of Fig. 3 [[FIGS. 4A and
4B illustrate two possible data structures stored within the hash memory of FIG. 3]] according
to an implementation [[implementations]] consistent with the principles of the invention; and

[0014] Figs. 5A-5E are flowcharts [[FIG. 5 is a flowchart]] of exemplary processing for
detecting and/or preventing transmission of an unwanted e-mail message [[a malicious packet]],
such as an e-mail containing a virus or worm, including a polymorphic virus or worm, [[FIG. 6
is a flowchart of exemplary processing for identifying ... a network by a malicious packet, such
as a virus or worm]] or an unsolicited commercial e-mail, according to an implementation
consistent with the principles of the invention. [[and FIG. 7 is a flowchart of exemplary
processing for determining whether a malicious packet, such ...]]


DETAILED DESCRIPTION 

The following detailed description of the invention refers to the accompanying drawings. The 
same reference numbers in different drawings may identify the same or similar elements. Also, 
the following detailed description does not limit the invention. Instead, the scope of the invention 
is defined by the appended claims and equivalents. 

Systems and methods consistent with the present invention provide virus, worm, and unsolicited
e-mail detection and/or prevention in e-mail servers. Placing these features in e-mail servers
provides a number of new advantages, including the ability to align hash blocks to crucial
boundaries found in e-mail messages and eliminate certain counter-measures by the attacker,
such as using small Internet Protocol (IP) fragments to limit the detectable content in each
packet. It also allows these features to relate e-mail header fields with the potentially-harmful
segment of the message (usually an "attachment"), and decode common file-packing and
encoding formats that might otherwise make a virus or worm undetectable by the packet-based
technique (e.g., ".zip files").

[0015] By placing these features within an e-mail server, the ability to detect replicated
content in the network at points where large quantities of traffic are present is obtained. By
relating many otherwise-independent messages and finding common factors, the e-mail server
may detect unknown, as well as known, viruses and worms. These features may also be applied
to detect potential unsolicited commercial e-mail ("spam").

[0016] E-mail servers for major Internet Service Providers (ISPs) may process a million
e-mail messages a day, or more, in a single server. When viruses and worms are active in the
network, a substantial fraction of this e-mail may actually be traffic generated by the virus or
worm. Thus, an e-mail server may have dozens to thousands of examples of a single
e-mail-borne virus pass through it in a day, offering an excellent opportunity to determine the
relationships between e-mail messages and detect replicated content (a feature that is indicative
of virus/worm propagation) and spamming, among other, more legitimate traffic (such as traffic
from legitimate mailing lists).

[0017] Systems and methods consistent with the principles of the invention provide
mechanisms to detect and stop e-mail-borne viruses and worms before the addressed user
receives them, in an environment where the virus is still inert. Current e-mail servers do not
normally execute any code in the e-mail being transported, so they are not usually subject to
virus/worm infections from the content of the e-mails they process, though they may be subject
to infection via other forms of attack.

[0018] Besides e-mail-borne viruses and worms, another common problem found in e-mail is
mass e-mailing of unsolicited commercial e-mail, colloquially referred to as "spam." It is
estimated that [...] of all e-mail messages now received for delivery by major ISP
e-mail servers is spam.

[0019] Users of network e-mail services are desirous of mechanisms to block e-mail
containing viruses or worms from reaching their machines (where the virus or worm may easily
do harm before the user realizes its presence). Users are also desirous of mechanisms to block
unsolicited commercial e-mail that consumes their time and resources.

[0020] Many commercial e-mail services put a limit on each user's e-mail accumulating at
the server, and not yet downloaded to the user's machine. If too much e-mail arrives
between times when the user reads his e-mail, additional e-mail is either "bounced" (i.e.,
returned to the sender's e-mail server) or even simply discarded, both of which events can
seriously inconvenience the user. Because the user has no control over arriving e-mail due to
e-mail-borne viruses/worms, or spam, it is a relatively common occurrence that the user's e-mail
quota overflows due to unwanted and potentially harmful messages. Similarly, the authors of
e-mail-borne viruses, as well as senders of spam, have no reason to limit the size of their
messages. As a result, these messages are often much larger than legitimate e-mail messages,
thereby increasing the risks of such denial of service to the user by overflowing the per-user
e-mail quota.

[0021] Users are not the only group inconvenienced by spam and e-mail-borne viruses and
worms. Because these types of unwanted e-mail can make up a substantial fraction of e-mail
traffic in the network over [...] periods of time, ISPs must provide extra resources to handle a
peak e-mail load that would otherwise be [...] as large. This ratio of unwanted-to-legitimate
e-mail traffic appears to be growing daily. Systems and methods consistent with the principles
of the invention provide mechanisms to detect and discard unwanted e-mail in network e-mail
servers.

[[/or prevent the transmission of malicious packets and trace the propagation of the
malicious packets through a network. Malicious packets, as used herein, may include viruses,
worms, and other types of data with duplicated content, such as illegal mass e-mail (e.g., spam),
that are repeatedly ... hashed ... a packet may be hashed.]]

EXEMPLARY SYSTEM CONFIGURATION

[0022] FIG. 1 is a diagram of an exemplary system 100 in which systems and methods
consistent with the present invention may be implemented. System 100 includes mail clients
[[autonomous systems (ASs)]] 110[[-140]] connected to a mail server 120 via a [[public]]
network 130 [[(PN) 150]]. Connections made in system 100 may be via wired, wireless, and/or
optical communication paths. While FIG. 1 shows three mail clients 110 [[four autonomous
systems]] connected to a single mail server 120 [[public network]], there can be more or fewer
clients and servers [[systems and networks]] in other implementations consistent with the
principles of the invention.

[0023] Network 130 may facilitate communication between mail clients 110 and mail server
120. Typically, [[Public]] network 130 [[150]] may include a collection of network devices, such
as routers [[(R1-R5)]] or switches, that transfer data between mail clients 110 and mail server
120 [[autonomous systems ...]]. In an implementation consistent with the present invention,
[[public]] network 130 may take [[150 takes]] the form of a wide area network, a local area
network, an intranet, the Internet, [[an intranet,]] a public telephone network, a different type of
network, or a combination of networks.

[0024] Mail clients 110 may include personal computers, laptops, personal digital assistants,
or other types of devices that are capable of interacting with mail server 120 to transmit and
receive e-mails. In another implementation, clients 110 may include software operating upon
one of these devices. Client 110 may present e-mails to a user via a graphical user interface.

[0025] Mail server 120 may include a computer or another device that is capable of
providing e-mail services for mail clients 110. In another implementation, server 120 may
include software operating upon one of these devices.

[[... wide area network (WAN), or the like. An autonomous system ... may include computers
or other types of communication devices (referred to as "hosts") that connect ... Autonomous
system 110, for example, includes hosts (H) 111-113 connected in a LAN configuration. Hosts
111-113 connect to public network 150 via an intruder detection system ... premise used by an
intruder detection system is that malicious network traffic will have a ... When a suspicious
pattern or ... is detected, intruder detection system 114 may take remedial action, or it can
instruct a border router or firewall to modify operation to address the malicious traffic, ...
discarding packets coming from a particular source address, or discarding packets addressed to a
particular destination.

Autonomous system 120 contains different devices ... 110. ... FIG. 1 shows only autonomous
system 120 as containing these devices, other autonomous systems, including autonomous
system 110, may include them. ... routers 126-129. Hosts 121-123 may include computers or
other types of communication devices ... source path identification when a malicious packet is
detected by intruder detection ... the present invention.]]

[0026] Fig. 2 is an exemplary diagram of mail server 120 [[security server 125]] according to
an implementation consistent with the principles of the invention. Server 120 [[Although one
possible configuration of security server 125 is illustrated in FIG. 2, other configurations are
possible. Security server 125]] may include bus 210, [[a]] processor 220[[202]], main memory
230[[204]], read only memory (ROM) 240[[206]], storage device 250, input device 260, output
device 270 [[208, bus 210, display 212, keyboard 214, cursor control 216]], and communication
interface 280. Bus 210 permits communication among the components of server 120.
[0027] [[218.]] Processor 220[[202]] may include any type of conventional processor or
microprocessor that [[processing device that]] interprets and executes instructions. Main
memory 230[[204]] may include a random access memory (RAM) or another [[a similar]] type
of dynamic storage device that stores [[. Main memory 204 may store]] information and
instructions for execution [[to be executed]] by processor 220[[202. Main memory 204 may also
be used for ... processor 202]]. ROM 240[[206]] may include a conventional ROM device or
another type of static storage device that stores [[store]] static information and instructions for
use by processor 220. Storage device 250 may include some type of magnetic or optical
recording medium and its corresponding drive.

[0028] Input device 260 may include one or more conventional mechanisms that permit an
operator to input information to server 120, such as a keyboard, a mouse, a pen, voice
recognition and/or biometric mechanisms, etc. Output device 270 may include one or more
conventional mechanisms that output information to the operator, such as a display, a printer, a
pair of speakers, etc. Communication interface 280 may include any transceiver-like mechanism
that enables server 120 to communicate with other devices and/or systems. For example,
communication interface 280 may include mechanisms for communicating with another device
or system via a network, such as network 130.

[0029] As will be described in detail below, server 120, consistent with the present
invention, provides e-mail services to clients 110, while detecting unwanted e-mails and/or
preventing unwanted e-mails from reaching clients 110. Server 120 [[202. It will be appreciated
that ROM 206 may be ... Bus 210 may include a set of hardware lines (conductors, optical
fibers, or the like) that allow for data transfer among the components of security server 125.
Display device 212 may be a ... security server 125. Communication interface 218 enables
security server 125 to communicate with other ... networks ... to facilitate operator or machine
remote control and communication with security server 125. As will be described in detail
below, security server 125 may perform source path identification and/or prevention measures
for a malicious packet that entered autonomous system 120. Security server 125]] may perform
these tasks [[functions]] in response to processor 220[[202]] executing sequences of instructions
contained in, for example, memory 230[[204]]. These instructions may be read into memory
230[[204]] from another computer-readable medium, such as storage device 250 or a carrier
wave [[208]], or from another device [[coupled to bus 210 or coupled]] via communication
interface 280.

[0030] Execution of the sequences of instructions contained in memory 230 may cause
processor 220 to perform processes that will be described later. [[218).]]

Alternatively, hardwired circuitry may be used in place of or in combination with software
instructions to implement processes consistent with the present invention. Thus, processes
performed by server 120 are not limited to any specific combination of hardware circuitry and
software.

[0031] Fig. 3 is an exemplary functional block diagram of mail server 120 according to an
implementation consistent with the principles of the invention. Server 120 may include a Simple
Mail Transfer Protocol (SMTP) block 310, a Post Office Protocol (POP) block 320, an Internet
Message Access Protocol (IMAP) block 330, and a hash processing block 340.

[0032] SMTP block 310 may permit mail server 120 to communicate with other mail servers
connected to network 130 or another network. SMTP is designed to efficiently and reliably
transfer e-mail across networks. SMTP defines the interaction between mail servers to facilitate
the transfer of e-mail even when the mail servers are implemented on different types of
computers or running different operating systems.

[0033] POP block 320 may permit mail clients 110 to retrieve e-mail from mail server 120.
POP block 320 may be designed to always receive incoming e-mail. POP block 320 may then
hold e-mail for mail clients 110 until mail clients 110 connect to download them.

[0034] IMAP block 330 may provide another mechanism by which mail clients 110 can
retrieve e-mail from mail server 120. IMAP block 330 may permit mail clients 110 to access
remote e-mail as if the e-mail was local to mail clients 110.

[0035] Hash processing block 340 may interact with SMTP block 310, POP block 320,
and/or IMAP block 330 to detect and prevent transmission of unwanted e-mail, such as e-mails
containing viruses or worms and unsolicited commercial e-mail (spam).

[[Returning to FIG. 1, security routers 126-129 may include network devices, such as routers,
that may detect and/or ... FIG. 3 is an exemplary ... within a device that taps one or more
bidirectional links of a router, such as security routers 126-129 ... Packet detection logic 300
may include hash processor 310 and hash memory 320. Hash ...]]

[0036] Fig. 4 is an exemplary diagram of hash processing block 340 according to an
implementation consistent with the principles of the invention. Hash processing block 340 may
include hash processor 410 and one or more hash memories 420. Hash processor 410 may
include a conventional processor, [[in]] an application specific integrated circuit (ASIC), a
field-programmable gate array (FPGA), or some other type of device [[or combination of these]]
that generates one or more representations for [[of]] each received e-mail [[packet]] and
records the e-mail [[packet]] representations in hash memory 420.
[0037] An e-mail [[320. A packet]] representation will likely not be a copy of the entire
e-mail [[packet]], but rather it may [[will]] include a portion of the e-mail [[packet]] or some
unique value representative of the e-mail. For [[packet. Because ... In contrast, storing a value
representative of the contents of a packet uses memory in a much more efficient manner. By way
of]] example, [[if incoming packets ... bits,]] a fixed width number may be computed across
portions of the e-mail [[fixed-sized blocks making up the content (or payload) of a packet]] in a
manner that allows the entire e-mail [[packet]] to be identified. To further illustrate the use of
representations, a 32-bit hash value, or digest, may be computed across portions [[fixed-sized
blocks]] of each e-mail [[packet]]. Then, the hash value may be stored in hash memory 420
[[320]] or may be used as an index, or address, into hash memory 420 [[320]]. Using the hash
value, or an index derived therefrom, results in efficient use of hash memory 420[[320]] while
still allowing the content of each e-mail [[packet]] passing through mail server 120 [[packet
detection logic 300]] to be identified.

Systems and methods consistent with the present invention may use any storage scheme that
records information about one or more portions of each e-mail [[packet]] in a space-efficient
fashion, that can definitively determine if a portion of an e-mail [[packet]] has not been observed,
and that can respond positively (i.e., in a predictable way) when a portion of an e-mail [[packet]]
has been observed. Although systems and methods consistent with the present invention can use
virtually any technique for deriving representations of portions of e-mails [[packets, for brevity]],
the remaining discussion will use hash values as exemplary representations of portions of e-mails
received by mail server 120.

[0038] In implementations consistent with the principles of the invention, hash processor 410
may hash one or more portions of a received e-mail to produce a hash value used to facilitate
hash-based detection. [[packets having passed through a participating router ... payload field.]]
For example, hash processor 410[[310]] may hash one or more of [[successive 64 byte blocks
following]] the main text within the message body, any attachments, and one or more header
fields, such as sender-related fields (e.g., "From:," "Sender:," "Reply-To:," "Return-Path:," and
"Error-To:"). Hash processor 410 may perform one or more hashes on each of the e-mail
portions using the same or different hash functions.
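
By way of illustration only, the per-portion hashing described above might be sketched with the Python standard library as follows. The function name `portion_hashes`, the choice of CRC-32 and MD5 as the two hash functions, and the labels attached to each portion are assumptions made for the example. The header names are taken from the list in the preceding paragraph.

```python
import hashlib
import zlib
from email.message import EmailMessage

# Sender-related header fields named in the description above.
SENDER_FIELDS = ("From", "Sender", "Reply-To", "Return-Path", "Error-To")


def portion_hashes(msg: EmailMessage):
    """Return (label, crc32, md5_hex) for header, main-text, and attachment portions."""
    portions = []
    for name in SENDER_FIELDS:
        value = msg.get(name)
        if value is not None:
            portions.append(("header:" + name, str(value).encode("utf-8")))
    body = msg.get_body(preferencelist=("plain",))  # main text part, if any
    if body is not None:
        portions.append(("text", body.get_content().encode("utf-8")))
    for part in msg.iter_attachments():
        # decode=True undoes the transfer encoding (e.g., base64) before hashing
        portions.append(("attachment", part.get_payload(decode=True) or b""))
    # Two different hash functions are applied to every portion.
    return [(label, zlib.crc32(data), hashlib.md5(data).hexdigest())
            for label, data in portions]
```

Hashing the decoded attachment bytes, rather than the encoded text, means two copies of the same attachment sent with different transfer encodings would still produce the same values in this sketch.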

[0039] As described in more detail below, hash processor 410[[310]] may use the hash
results of the hash operation to recognize duplicate occurrences of e-mails [[packet content]] and
raise a warning if the duplicate e-mail occurrences arrive within [[it detects packets with
replicated content within]] a short period of time and raise their level of suspicion above some
threshold. It [[Hash processor 310]] may also be possible to use the hash results for tracing the
path of an unwanted e-mail [[a malicious packet]] through the network.
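
By way of illustration only, recognizing duplicate occurrences that arrive within a short period of time might be sketched as a sliding time window per hash value. The class name `RecentDuplicates`, the window length, and the threshold are hypothetical and are not taken from the description.

```python
from collections import defaultdict, deque


class RecentDuplicates:
    """Track when each hash value was seen and flag bursts of repeats."""

    def __init__(self, window_seconds: float, threshold: int):
        self.window = window_seconds
        self.threshold = threshold
        self._seen = defaultdict(deque)  # hash value -> timestamps of recent sightings

    def observe(self, hash_value: int, now: float) -> bool:
        """Record one sighting; return True if a warning should be raised."""
        times = self._seen[hash_value]
        while times and now - times[0] > self.window:
            times.popleft()  # forget sightings older than the window
        times.append(now)
        return len(times) > self.threshold
```

A caller would pass the arrival time of each message, for example from `time.time()`, together with each hash value generated for that message.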

[0040] Each [[The]] hash value may be determined by taking an input block of data [[, such as
a 64 byte block of a packet,]] and processing it to obtain a numerical value that represents the
given input data. Suitable hash functions are readily known in the art and will not be discussed in
detail herein. Examples of hash functions include the Cyclic Redundancy Check (CRC) and
Message Digest 5 (MD5).
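
By way of illustration only, both example hash functions are available in the Python standard library. The wrapper names `crc32_digest` and `md5_digest` are assumptions made for the example.

```python
import hashlib
import zlib


def crc32_digest(block: bytes) -> int:
    """32-bit CRC of an input block, returned as an unsigned integer."""
    return zlib.crc32(block) & 0xFFFFFFFF


def md5_digest(block: bytes) -> bytes:
    """128-bit MD5 digest of an input block (always 16 bytes)."""
    return hashlib.md5(block).digest()
```

Either function maps an input block of any length to a value of fixed length, which is what allows the value to be stored compactly or used as an index.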

The resulting hash value, also referred to as a message digest or hash digest, may include [[is]] a
fixed length value. The hash value may serve [[serves]] as a signature for the data over which it
was computed. [[... their content.]]

The hash value essentially acts as a fingerprint identifying the input block of data over which it
was computed. Unlike fingerprints, however, there is a chance that two very different pieces of
data will hash to the same value, resulting in a hash collision. An acceptable hash function
should provide a good distribution of values over a variety of data inputs in order to prevent
these collisions. Because collisions occur when different input blocks result in the same hash
value, an ambiguity may arise when attempting to associate a result with a particular input.
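
By way of illustration only, the chance of at least one collision can be estimated with the standard birthday approximation, assuming (as an idealization not stated in the description) that hash values are uniformly distributed. The function name `collision_probability` is hypothetical.

```python
import math


def collision_probability(num_blocks: int, hash_bits: int) -> float:
    """Approximate probability that at least two of num_blocks distinct inputs
    share a hash value, for a uniformly distributed hash of hash_bits bits."""
    space = 2 ** hash_bits
    return 1.0 - math.exp(-num_blocks * (num_blocks - 1) / (2.0 * space))
```

Under this approximation a 32-bit hash reaches roughly even odds of some collision after a few tens of thousands of distinct blocks, which is one reason a match may be treated as grounds for suspicion rather than as proof of identical content.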

Hash processor 410 may store a representation of each e-mail it observes in hash memory 420.
Hash processor 410 may store the actual hash values as the e-mail representations or it may use
other techniques for minimizing storage requirements associated with retaining hash values and
other information associated therewith. A technique for minimizing storage requirements may use
one or more arrays or Bloom filters for storing hash values.

Rather than storing the actual hash value, which can typically be on the order of 32 bits or
more in length, hash processor 410 may use the hash value as an index for addressing an array
within hash memory 420. In other words, when hash processor 410 generates a hash value for a
portion of an e-mail, the hash value serves as the address location into the array. At the
address corresponding to the hash value, a count value may be incremented at the respective
storage location, thus indicating that a particular hash value, and hence a particular e-mail
portion, has been seen by hash processor 410. In one implementation, the count value is
associated with a counter with a maximum value. While counter arrays are described by way of
example, it will be appreciated by those skilled in the relevant art that other
storage techniques may be employed without departing from the spirit of the invention.
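
The indexing scheme can be sketched as follows. This is a minimal illustration, not the
implementation described here: the class name `CounterArray`, the array size of 65,536 slots, the
use of CRC-32 as the hash, and the saturation value of 255 are all assumptions chosen for the
example.

```python
import zlib


class CounterArray:
    """Hash-indexed array of saturating counters (illustrative only)."""

    def __init__(self, size: int = 1 << 16, max_count: int = 255):
        if max_count > 255:
            raise ValueError("a bytearray slot holds at most 255")
        self.size = size
        self.max_count = max_count        # assumed ceiling; the counter stops here
        self.counts = bytearray(size)     # one small counter per storage location

    def index(self, block: bytes) -> int:
        # The hash value itself is used as the address into the array.
        return zlib.crc32(block) % self.size

    def observe(self, block: bytes) -> int:
        """Record one sighting of `block` and return its updated count."""
        i = self.index(block)
        if self.counts[i] < self.max_count:
            self.counts[i] += 1
        return self.counts[i]

    def count(self, block: bytes) -> int:
        return self.counts[self.index(block)]
```

Because only a counter is kept at each address, storage does not grow with the number of e-mails
seen; the cost is that two different blocks may share a slot when their hash values collide.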
[0041] Hash memory 420 may store other information that is used to determine the overall
suspiciousness of an e-mail message. For example, the count value (described above) may be
compared to a threshold, and a suspicion count for the e-mail may be incremented if the
threshold is exceeded. Hence, there may be a direct relationship between the count value and the
suspicion count, and it may be possible for the two values to be the same. The larger the
suspicion count, the more important the hit should be considered in determining the overall
suspiciousness of the packet. Alternatively, the suspicion count can be combined in a "scoring
function" with values from this or other hash blocks in the same message in order to determine
whether the message should be considered suspicious.

[0042] It is not enough, however, for hash memory 420 to simply identify that an e-mail
contains content that has been seen recently. There are many legitimate sources (e.g., e-mail
list servers) that produce multiple copies of the same message, addressed to multiple
recipients. Similarly, individual users often e-mail messages to a group of people and, thus,
multiple copies might be seen if several recipients happen to receive their mail from the same
server. Also, people often forward copies of received messages to friends or co-workers.

[0043] In addition, virus/worm authors typically try to minimize the replicated content in
each copy of the virus/worm, in order to not be detected by existing virus and worm detection
technology that depends on detecting fixed sequences of bytes in a known virus or worm. These
mutable viruses/worms are usually known as polymorphic, and the attacker's goal is to minimize
the recognizability of the virus or worm by scrambling each copy in a different way. For the
virus or worm to remain viable, however, a small part of it can be mutable in only a relatively
small number of ways, because some of its code must be immediately executable by the victim's
computer, and that limits the mutation possibilities for the critical initial code part.
[0044] In order to accomplish the proper classification of various types of legitimate and
unwanted e-mail messages, multiple hash memories 420 can be employed, with separate hash
memories 420 being used for specific sub-parts of a standard e-mail message. The outputs of
different ones of hash memories 420 can then be combined in an overall "scoring" or
classification function to determine whether the message is undesirable or legitimate, and
possibly estimate the probability that it belongs to a particular class of traffic, such as a
virus/worm message, spam, e-mail list message, or normal user-to-user message.
[0045] For e-mail following the Internet mail standard RFC 822 (and its various extensions),
hashing of certain individual e-mail header fields into field-specific hash memories 420 may be
useful. Among the header fields for which this may be helpful are: (1) various sender-related
fields, such as "From:", "Sender:", "Reply-To:", "Return-Path:" and "Error-To:"; (2) the "To:"
field (often a fixed value for a mailing list, frequently missing or idiosyncratic in spam
messages); and (3) the last few "Received:" headers (i.e., the earliest ones, since they are
normally added at the top of the message), excluding any obvious timestamp data. It may also
be useful to hash a combination of the "From:" field and the e-mail address of the recipient
(transferred as part of the SMTP mail-transfer protocol, and not necessarily found in the message
itself).
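
A possible way to derive per-field hashes from an RFC 822 style message is sketched below using
Python's standard `email` package. The function name `header_hashes`, the lower-casing of field
values before hashing, the dictionary key `"From+recipient"`, and the choice of MD5 are
assumptions for illustration. The "Received:" headers are omitted here to keep the example short.

```python
import hashlib
from email import message_from_string

SENDER_FIELDS = ("From", "Sender", "Reply-To", "Return-Path", "Error-To")


def _digest(text: str) -> str:
    # Normalising case and surrounding whitespace is an assumption of this sketch.
    return hashlib.md5(text.strip().lower().encode("utf-8")).hexdigest()


def header_hashes(raw_message: str, envelope_recipient: str) -> dict:
    """Return one hash per header field of interest, keyed by field name."""
    msg = message_from_string(raw_message)
    result = {}
    for name in SENDER_FIELDS:
        value = msg.get(name)
        if value is not None:
            result[name] = _digest(value)
    if msg.get("To") is not None:
        result["To"] = _digest(msg["To"])
    if msg.get("From") is not None:
        # The recipient comes from the SMTP envelope, not from the message itself.
        result["From+recipient"] = _digest(msg["From"] + "\n" + envelope_recipient)
    return result
```

Each resulting hash would then be looked up in, or added to, the hash memory dedicated to that
field, so that counts for sender fields and for message content can be compared separately.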

[0046] Any or all of hash memories 420 may be pre-loaded with knowledge of known good
or bad traffic. For example, known viruses and spam content (e.g., the infamous "Craig
Shergold letter" or many pyramid swindle letters) can be pre-hashed into the relevant hash
memories 420, and/or periodically refreshed in the memory as part of a periodic "cleaning"
process described below. Also, known legitimate mailing lists, such as mailing lists from
legitimate e-mail list servers, can be added to a "From" hash memory 420 that passes traffic
without further examination.

[0047] Over time, hash memories 420 may fill up and the possibility of overflowing an existing
count value increases. The risk of overflowing a count value may be reduced if the counter
arrays are periodically flushed to other storage media, such as a magnetic disk drive, optical
media, solid state drive, or the like. Alternatively, the counter arrays may be slowly and
incrementally erased. To facilitate this, a time-table may be established for flushing/erasing
the counter arrays. If desired, the flushing/erasing cycle can be reduced by computing hash
values only for a subset of the e-mails received by mail server 120. While this approach reduces
the flushing/erasing cycle, it increases the possibility that a target e-mail may be missed
(i.e., a hash value is not computed over a portion of it).
[0048] Non-zero storage locations within hash memories 420 may be decremented
periodically rather than being erased. This may ensure that the "random noise" from normal
e-mail traffic would not remain in a counter array indefinitely. Replicated traffic (e.g.,
e-mails containing a virus/worm that are propagating repeatedly across the network), however,
would normally cause the relevant storage locations to stay above the "background noise"
level.

[0049] One way to decrement the count values in the counter array fairly is to keep a total
count, for each hash memory 420, of every time one of the count values is incremented. After
this total count reaches some threshold value (probably in the millions), for every time a count
value is incremented in hash memory 420, another count value gets decremented. One way to
pick the count value to decrement is to keep a counter, as a decrement pointer, that simply
iterates through the storage locations sequentially. Every time a decrement operation is
performed, the following may be done: (a) examine the candidate count value to be decremented
and, if non-zero, decrement it and increment the decrement pointer to the next storage location;
and (b) if the candidate count value is zero, then examine each sequentially-following storage
location until a non-zero count value is found, decrement that count value, and advance the
decrement pointer to the following storage location.
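
The pointer-driven decrement can be sketched as below. The class name `DecayingCounters` and the
parameter `decay_after` (the threshold on the running total of increments) are hypothetical names
introduced for this illustration, and a plain Python list stands in for a hash memory.

```python
class DecayingCounters:
    """Counter array in which each new increment is paired with one decrement."""

    def __init__(self, size: int, decay_after: int):
        self.counts = [0] * size
        self.total = 0                  # running total of increments ever made
        self.decay_after = decay_after  # assumed threshold before decrements begin
        self.pointer = 0                # decrement pointer

    def increment(self, index: int) -> None:
        self.counts[index] += 1
        self.total += 1
        if self.total > self.decay_after:
            self._decrement_next()

    def _decrement_next(self) -> None:
        size = len(self.counts)
        # Start at the pointer and walk forward to the first non-zero count.
        for offset in range(size):
            i = (self.pointer + offset) % size
            if self.counts[i] > 0:
                self.counts[i] -= 1               # never goes below zero
                self.pointer = (i + 1) % size     # advance past the slot just used
                return
```

Once the threshold has been reached, every increment is matched by exactly one decrement, so the
sum of all counts stays constant no matter how quickly or slowly e-mail arrives.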

[0050] It may be important to avoid decrementing any counters below zero, while not
biasing decrements unfairly. Because it may be assumed that the hash is random, this technique
should not favor any particular storage location, since it visits each of them before starting
over. This technique maintains a fixed total count population across all of the storage
locations, representing the most recent history of traffic, and is not subject to changes in
behavior as the volume of traffic varies over time.

[0051] A variation of this technique may include randomly selecting a count value to
decrement, rather than processing them cyclically. In this variation, if the chosen count value
is already zero, then another one could be picked randomly, or the count values in the storage
locations following the initially-chosen one could be examined in series, until a non-zero count
value is found.
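
A minimal sketch of the random variation follows, using the second fallback (scanning forward
from the randomly chosen location). The function name `decrement_random` and the idea of passing
in a `random.Random` instance are assumptions made so the example is repeatable.

```python
import random


def decrement_random(counts: list, rng: random.Random):
    """Decrement one non-zero count, starting from a random location.

    Returns the index that was decremented, or None if every count is zero.
    """
    size = len(counts)
    start = rng.randrange(size)
    # If the chosen slot is zero, examine the following slots in series.
    for offset in range(size):
        i = (start + offset) % size
        if counts[i] > 0:
            counts[i] -= 1
            return i
    return None
```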




EXEMPLARY PROCESSING FOR UNWANTED E-MAIL DETECTION/PREVENTION
[0052] Figs. 5A-5E are flowcharts of exemplary processing for detecting and/or preventing
transmission of unwanted e-mail, such as an e-mail containing a virus or worm, including a
polymorphic virus or worm, or an unsolicited commercial e-mail (spam), according to an
implementation consistent with the principles of the invention. The processing of Figs. 5A-5E
will be described in terms of a series of acts that may be performed by mail server 120. In
implementations consistent with the principles of the invention, some of the acts may be optional
and/or performed in an order different than that described. In other implementations, different
acts may be substituted for described acts or added to the process.

[0053] Processing may begin when hash processor 410 (Fig. 4) receives, or otherwise observes,
an e-mail message (act 502) (Fig. 5A). Hash processor 410 may hash the main text of the message
(act 504). When hashing the main text, hash processor 410 may perform one or more conventional
hashes covering one or more portions, or all, of the main text. For example, hash processor 410
may perform hash functions on fixed or variable sized blocks of the main text. It may be
beneficial for hash processor 410 to perform multiple hashes on each of the blocks using the
same or different hash functions.

[0054] It may be desirable to pre-process the main text to remove attempts to fool
pattern-matching mail filters. An example of this is HTML e-mail,
where spammers often insert random text strings in HTML comments between or within words
of the text. Such e-mail may be referred to as "polymorphic spam" because it attempts to make
each message appear unique. This method for evading detection might otherwise defeat the hash
detection technique, or other string-matching techniques. Thus, removing all HTML comments
from the message before hashing it may be desirable. It might also be useful to delete HTML
tags from the message, or apply other specialized, but simple, pre-processing techniques to
remove content not actually presented to the user. In general, this may be done in parallel with
the hashing of the message text, since viruses and worms may be hidden in the non-visible
content of the message text.
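
The comment-stripping step might look like the following sketch. The regular expression, the
function names `strip_html_comments` and `normalized_hash`, and the two sample messages are
assumptions for illustration; a production filter would likely use a real HTML parser rather than
a regular expression.

```python
import hashlib
import re

# Non-greedy match of an HTML comment, including comments that span lines.
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def strip_html_comments(html: str) -> str:
    return HTML_COMMENT.sub("", html)


def normalized_hash(html: str) -> str:
    """Hash the message text after removing content the user never sees."""
    return hashlib.md5(strip_html_comments(html).encode("utf-8")).hexdigest()


# Two copies of the same message with different random filler in comments.
a = "<p>Buy <!--x81f-->now, lim<!--qq7-->ited offer</p>"
b = "<p>Buy <!--zz90-->now, lim<!--1ab-->ited offer</p>"
```

Hashed as received, `a` and `b` differ; after stripping, both reduce to the same text and
therefore to the same hash value.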

[0055] Hash processor 410 may also hash any attachments, after first attempting to expand
them if they appear to be known types of compressed files (e.g., "zip" files) (act 506). When
hashing an attachment, hash processor 410 may perform one or more conventional hashes
covering one or more portions, or all, of the attachment. For example, hash processor 410 may
perform hash functions on fixed or variable sized blocks of the attachment. It may be beneficial
for hash processor 410 to perform multiple hashes on each of the blocks using the same or
different hash functions.
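
Expanding a compressed attachment before hashing can be sketched with Python's standard
`zipfile` module. The function names, the 64-byte block size, and the use of MD5 are assumptions
of this example, and safeguards a real server would need (such as limits on expanded size) are
left out for brevity.

```python
import hashlib
import io
import zipfile


def attachment_contents(payload: bytes):
    """Yield the raw contents to hash: archive members if zipped, else the payload."""
    buffer = io.BytesIO(payload)
    if zipfile.is_zipfile(buffer):
        with zipfile.ZipFile(buffer) as archive:
            for name in archive.namelist():
                yield archive.read(name)   # decompressed bytes of one member
    else:
        yield payload


def hash_attachment(payload: bytes, block_size: int = 64) -> list:
    """Hash each fixed-size block of the (expanded) attachment."""
    digests = []
    for content in attachment_contents(payload):
        for start in range(0, len(content), block_size):
            digests.append(hashlib.md5(content[start:start + block_size]).hexdigest())
    return digests
```

With this approach, the same file produces the same block hashes whether it arrives bare or
inside an archive.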
[0056] Hash processor 410 may compare the main text and attachment hashes with known
viruses, worms, or spam content in a hash memory 420 that is pre-loaded with information from
known viruses, worms, and spam content (acts 508 and 510). If there are any hits in this hash
memory 420, there is a probability that the e-mail message contains a virus or worm or is spam.
A known polymorphic virus may have only a small number of hashes that match in this hash
memory 420, out of the total number of hash blocks in the message. A non-polymorphic virus
may have a very high fraction of the hash blocks hit in hash memory 420. For this reason,
storage locations within hash memory 420 that contain entries from polymorphic viruses or
worms may be given more weight during the pre-loading process, such as by giving them a high
initial suspicion count value.

[0057] A high fraction of hits in this hash memory 420 may cause the message to be marked
as a probable known virus/worm or spam. In this case, the e-mail message can be sidetracked
for remedial action.

[0058] A message with a significant "score" from polymorphic virus/worm hash value hits
may or may not be a virus/worm instance, and may be sidetracked for further investigation, or
subjected to an additional check to determine the level of suspicion.

[0059] For example, hash processor 410 may hash a concatenation of the From and To
header fields of the e-mail message (act 512) (Fig. 5B). Hash processor 410 may then check the
suspicion counts in hash memories 420 for the hashes of the main text, any attachments, and the
concatenated From/To (act 514). Hash processor 410 may determine whether the main text or
attachment suspicion count is significantly higher than the From/To suspicion count (act 516). If
so, then the content is appearing much more frequently outside the messages between this set of
users (which might otherwise be due to an e-mail exchange with repeated message quotations)
and, thus, is much more suspicious.
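
The comparison in act 516 can be pictured with the small sketch below. The text says only that
the content count should be "significantly higher" than the From/To count; the multiplicative
`ratio` of 4 used here, and the function name, are assumptions chosen purely for illustration.

```python
def content_outpaces_correspondents(text_count: int,
                                    attachment_counts: list,
                                    from_to_count: int,
                                    ratio: float = 4.0) -> bool:
    """True if content has been seen far more often than this From/To pair."""
    content_count = max([text_count, *attachment_counts])
    # Treat an unseen From/To pair as 1 so that small content counts do not trigger.
    return content_count > ratio * max(from_to_count, 1)
```

A long reply chain between the same two users raises both counts together and returns False,
whereas widely replicated content arriving from a previously unseen sender returns True.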

[0060] When this occurs, hash processor 410 may take remedial action (act 518). The
remedial action taken might vary and may be programmable or determined by
an operator of mail server 120. For example, hash processor 410 may discard the e-mail. This is
not recommended for anything but virtually-certain virus/worm/spam identification, such as a
perfect match to a known virus.

[0061] As an alternate technique, hash processor 410 may mark the e-mail with a warning in
the message body, in an additional header, or other user-visible annotation, and allow the user
to deal with it when it is downloaded. For data that appears to be from an unknown mailing list,
a variant of this option is to request the user to send back a reply message to the server,
classifying the suspect message as either spam or a mailing list. In the latter case, the mailing
list source address can be added to the "known legitimate mailing lists" hash memory 420.
[0062] As another technique, hash processor 410 may subject the e-mail to more
sophisticated (and possibly more resource-consuming) detection algorithms to make a more
certain determination. This is recommended for potential unknown viruses/worms or possible
detection of a polymorphic virus/worm.

[0063] As yet another technique, hash processor 410 may hold the e-mail message in a
special area and create a special e-mail message to notify the user of the held message (probably
including From and Subject fields). Hash processor 410 may also give instructions on how to
retrieve the message.

[0064] As a further technique, hash processor 410 may mark the e-mail message with its
suspicion score result, but leave it queued for the user's retrieval. If the user's quota would
overflow when a new message arrives, the score of the incoming message and the highest score
of the queued messages are compared. If the highest queued message has a score above a
settable threshold, and the new message's score is lower than the threshold, the queued message
with the highest score may be deleted from the queue to make room for the new message.
Otherwise, if the new message has a score above the threshold, it may be discarded or "bounced"
(e.g., the sending e-mail server is told to hold the message and retry it later). Alternatively,
if it is desired to never bounce incoming messages, mail server 120 may accept the incoming
message into the user's queue and repeatedly delete messages with the highest suspicion score
from the queue until the total is below the user's quota again.
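
The first quota policy can be sketched as follows. Messages are modelled as `(score, size)`
tuples, and the function name `admit` and the returned strings are hypothetical. Two details are
assumptions that the text leaves open: a new message scoring exactly at the threshold is bounced,
and when nothing suspicious can be evicted the function reports `"quota_exceeded"`.

```python
def admit(queue: list, new_msg: tuple, quota: int, threshold: float) -> str:
    """Try to place new_msg = (score, size) in the user's queue (mutated in place)."""
    new_score, new_size = new_msg
    used = sum(size for _score, size in queue)
    if used + new_size <= quota:
        queue.append(new_msg)
        return "queued"
    if new_score >= threshold:
        return "bounced"              # suspicious newcomer: discard or ask for a retry
    # Newcomer looks clean: evict the most suspicious queued messages to make room.
    while queue and used + new_size > quota:
        worst = max(queue, key=lambda m: m[0])
        if worst[0] <= threshold:
            break                     # nothing suspicious left to evict
        queue.remove(worst)
        used -= worst[1]
    if used + new_size > quota:
        return "quota_exceeded"       # assumed outcome; not specified in the text
    queue.append(new_msg)
    return "queued"
```

Under this policy a mailbox filled with high-scoring messages cannot crowd out low-scoring
mail, since the most suspicious entries are the first to be removed.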



In re: U.S. 10/654,771
(Changes made to 10/251,403 to create CIP app. 10/654,771)



[0065] As another technique, hash processor 410 may apply hash-based functions as the
e-mail message starts arriving from the sending server and determine the message's suspicion
score incrementally as the message is read in. If the message has a high-enough suspicion score
(above a threshold) during the early part of the message, mail server 120 may reject the message,
optionally with either a "retry later" or a "permanent refusal" result to the sending server
(which one is used may be determined by settable thresholds applied to the total suspicion score,
and possibly other factors, such as server load). This results in the unwanted e-mail using up
less network bandwidth and receiving server resources, and penalizes servers sending unwanted
mail, relative to those that do not.
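
Incremental scoring during receipt can be sketched as below. The function name
`receive_incrementally`, the caller-supplied `score_block` callback (standing in for a lookup of
a chunk's suspicion count), and the single `reject_threshold` are assumptions of this example;
the choice between a "retry later" and a "permanent refusal" reply is not modelled.

```python
from typing import Callable, Iterable, Tuple


def receive_incrementally(chunks: Iterable[bytes],
                          score_block: Callable[[bytes], int],
                          reject_threshold: int) -> Tuple[str, int]:
    """Score a message chunk by chunk and stop reading once it looks unwanted.

    Returns (decision, bytes_received_before_deciding).
    """
    total_score = 0
    received = 0
    for chunk in chunks:
        received += len(chunk)
        total_score += score_block(chunk)
        if total_score > reject_threshold:
            return "rejected", received   # remaining chunks are never read
    return "accepted", received
```

The second element of the result shows the saving: a rejected message costs only the bytes read
up to the point of rejection rather than its full length.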

[0066] If the suspicion count for the main text or any attachment is not significantly higher
than the From/To suspicion count (act 516), hash processor 410 may determine whether the main
text or any attachment has significant replicated content (non-zero or high suspicion count values
for many hash blocks in the text/attachment content in hash memories
420) (act 520) (Fig. 5A). If not, the message is probably a normal user-to-user e-mail. These
types of messages may be "passed" without further examination. When appropriate, hash
processor 410 may also record the generated hash values by incrementing the suspicion count
value in the corresponding storage locations in hash memory 420.

[0067] If the message text is substantially replicated (e.g., greater than 90%), hash processor
410 may check one or more portions of the e-mail message against known legitimate mailing
lists within hash memory 420 (act 522) (Fig. 5C). For example, hash processor 410 may hash
the From or Sender fields of the e-mail message and compare it/them to known legitimate
mailing lists within hash memory 420. Hash processor 410 may also determine whether the
e-mail actually appears to originate from the correct source for the mailing list by examining,
for example, the last few Received headers. Hash processor 410 may further examine a
combination of the From or Sender fields and the recipient address to determine if the recipient
has previously received e-mail from the sender. This is typical for mailing lists, but atypical
of unwanted e-mail, which will normally not have access to the actual list of recipients for the
mailing list. Failure of this examination may simply pass the message on, but mark it as


"suspicious," since the recipient may simply be a new subscriber to the mailing list, or the
mailings may be infrequent enough to not persist in the hash counters between mailings.
[0068] If there is a match with a legitimate mailing list (act 524), then the message is
probably a legitimate mailing list duplicate and may be passed with no further examination. This
assumes that the mailing list server employs some kind of filtering to exclude unwanted e-mail
(e.g., refusing to forward e-mail that does not originate with a known list recipient or refusing
e-mail with attachments).

[0069] If there is no match with any legitimate mailing lists in hash memory 420, hash
processor 410 may hash the sender-related fields (e.g., From, Sender, Reply-To) (act 526). Hash
processor 410 may then determine the suspicion count for the sender-related hashes in hash
memories 420 (act 528).

[0070] Hash processor 410 may determine whether the suspicion counts for the sender-
related hashes are similar to the suspicion count(s) for the main text hash(es) (act 530) (Fig. 5D).
If both From and Sender fields are present, then the Sender field should match with roughly the
same suspicion count value as the message body hash. The From field may or may not match.
For a legitimate mailing list, it may be a legitimate mailing list that is not in the known
legitimate mailing lists hash memory 420 (or in the case where there is no known legitimate
mailing lists hash memory 420). If only the From field is present, it should match about as well
as the message text for a mailing list. If none of the sender-related fields match as well as the
message text, the e-mail message may be considered moderately suspicious (probably spam,
with a variable and fictitious From address or the like).

[0071] As an additional check, hash processor 410 may hash the concatenation of the sender-
related field with the highest suspicion count value and the e-mail recipient's address (act 532).
Hash processor 410 may then check the suspicion count for the concatenation in a hash memory
420 used just for this check (act 534). If it matches with a significant suspicion count value (act
536) (Fig. 5E), then the recipient has recently received multiple messages from this source,
which makes it probable that it is a mailing list. The e-mail message may then be passed without
further examination.
[0072] If the message text or attachments are mostly replicated (e.g., greater than 90% of the
hash blocks), but with mostly low suspicion count values in hash memory 420 (act 538), then the
message is probably a case of a small-scale replication of a single message to multiple
recipients. In this case, the e-mail message may then be passed without further examination.
[0073] If the message text or attachments contain some significant degree of content
replication and at least some of the hash values have
high suspicion count values in hash memory 420 (act 540), then the message is fairly likely to be
a virus/worm or spam. A virus or worm should be considered more likely if the high-count
matches are in an attachment. If the highly-replicated content is in the message text, then the
message is more likely to be spam, though it is possible that e-mail text employing a scripting
language (e.g., Java script) might also contain a virus.

[0074] If the replication is in the message text, and the suspicion count is substantially higher
for the message text than for the From field, the message is likely to be spam (because spammers
generally vary the From field to evade simpler spam filters). A similar check can be made for
the concatenation of the From and To header fields, except that in this case, it is most suspicious
if the From/To hash misses (finds a zero suspicion count), indicating that the sender does not
ordinarily send e-mail to that recipient, making it unlikely to be a mailing list, and very likely to
be a spammer (because they normally employ random or fictitious From addresses).
[0075] In the above cases, hash processor 410 may take remedial action (act 542). The
remedial action taken by hash processor 410 may vary as described above.
[0076J pac ket's pay load field (act 510). Hash proces^^HO - -R^y-tt^-a-€tm¥ef^4eftal 

m ay be p^ I f 
eft e -ey-wior - fr-ef-tfeft-generated hash valued match one of the hash values of known virases-aiiH&ey 
woHn%4*a s43-pr^ ^ 
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n ^ y- include rai s -^ 
f e i f #m * g4ftm>a^^^ 

possibly ether -packets- originating- from th e s ame Internet Protocol- f IP) address as ft e paefcet^ 

(and likely to -cause th e r e ceiv e r to drop the packet). 
tf-th e gea 

deteen t n e w n e th e r th e pack e t's soore e address ind i cat e s that th e pack e t was sent - feetn a 
legitimate source of duplicated packet content (i .e., a legitimate "replicator" ) <act530). For 

and cheek the source address of the packet with the addresses of legitimate replicators on the list: 
If th e' packet's sourc e addr e ss matches th e addr e ss of one e f < he l e gitimate repl ica tors; ■ then hash 
|>foe e& S0f"^4 - O"H>a - y"eHd"pr - 6ee ^ g -ing of t h e packelr^r-«x<^jpleyffoee8sk^-fla8y"r«tt»»-te-ftet-S0S- 
and r -await ? e c e ipi of the next packet: 

Qtfeerwise r has^ 

generated hash value(s) as an addr e ss into hash m e mory 320. Hash proc e ssor 310 may th e n 
e x-aiBiftfr-mdicator field 412 (FIG. 4) at each address to dete^me-wfeefeeHbe^fte^e^«H»re^^ 
stored th e rein indicate that a prior packet has be e n rec e i ved. 

tfth e Few e f^^ 
t -e eor#^^^ 
■34&4 ra> y-- s e4~#K ^ er re HW-^^ 
genem te 44ia s- l^^ 
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3 4&rPt^e s shtg-ffl ^ ^ 
If hashpreeesser 3 
pTO€es ;K>iKM#4Hay -4j^^ 

with the same hash value have to b e observ e d by hash proc e ssor 310 b e fore the pack e ts ar e 
ld e Mifi€ i d"as-j>eteitjftHy"mftHe - k>i ts- . - "Th e "f»ks might al s o specify that these pack e ts hove to have 
be e a-oh s erved '' ^ 

mtdtiple pack e ts will lik e ly pass through pack e t d e t e ction logi^ 
A - pae& e Hftay - ^onta^ ^ ^^ 

packets; Fof exontple^ Q packet tha t includes multtpl e 4m s l > 4> l eek- !jr -may have somewhere between 
on e and all of its hash e d cont e nt blocks match hash blocks associ a ted with prior pack e ts. Th e 

tfrat ft ee tl to - inateb ^b^ 
snt4i - #fr^ ^ ^ 

predetermined number of th e packet blocks with the sam e hash values ar e obs e rved or when the 

pae-ketsare o b s er v ed outside the specified period of ttme,-4ia3 i ^pre€«3 s ^ 

gen e rat e d hash va.l»e(s) in hash m e mory 320 (act 540), For exampl e , hash proc e ssor 3 10 may set 

vate s -rtolndka^ 
Pfoe e ^agH S B « y-fe 
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may ^ e-r^edi ^ 
fe - p a ek e t4s - #t4^ 
a legitimate repfe 
fxic^et-aetnaiiv^&ift^^^^^ 

human -analysis, dropping th e pack e t, corrupting the pack e t content in a way lik e ly to render any 
eode-^emaffl e d-thgfeffl-ifl e ft^fflid - ^k -e iy- t o - cau a e t he r e ceiver to drop the packet), delaying 
transmission of the 
fete. * pp i ng^ 

messag e to the s e nd e r th e reby pr e v e nt i ng compl e t e transm i ssion of the packet; and/or 
disconnecting the link on which the packet was received. Some of the remedial actions, such as 
dK>ppk * g - o* - c - 8fmpfe 

tmlioious-iS ' abovesome-rfiire s boyi-l^is may greatly slow the spread rate of a virus or worm 
without compl e t el y stopping legitimat e traffic that happ e n e d to match a susp e ct profil e . 

EXEMPLARY PROCESSING FOR SOURCE PATH IDENTIFICATION 

F-K4/-6--is-a--^ 

th e principles of the inv e ntion. The processing of FIG. 6 may b e p e rform e d by a s e curity server, 
suefe-afr-seewity server 125, or other devices configut-ed^faoe^t^^-^^-^"^^^^ 
packets. In oth e r implementations, one or more of the d e scrib e d ac ts may b e p e rformed- by other 

Pfoe es tktgHtf^ I n t mfe 

de4 e €4ien-- s y^m42^^^ 
e i * a ft * p k ^4n t fndef - 4 e ^^^^ 
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paftof-an-abiMr^ 
d e t^4k » n -sys a e B ^^^^ 

within -autoaemeas- system 4 20v Th e notification ■ may ■ imMe4hQ'm»UeiBm'f)mk^'m'P@s^&m 
thereof -a jeng-wiliv^^^ 

information, link information, and the lik e . 
Alter-r- e e e ivin^ 

participating rotif e rs, s uch a s ■s ecitrityTO ; atBT ; s-}24"129 ' (actS ' 6Q5"a i Hl-61 : 0>"¥vxaiBples--of 
additional information that may be included in the query are, but are not limited to, destination 
addf e ss e s4e? - fHH - tk4^ 

■H-HWnmtK>n time - to - iiv'e and the like: 

S e curity serva-"l'2§"iinay'then'$end"the"qaery"tO'Seeur'it>' rout e rfs) l ocat e d on e hop away (act 
%4-£) : --4-he see«tf& yH? ettte*foO^^ 
mal t ekftHr p ack et: ^ 

response may indicate that th e security rout e r has se e n the malicious pack e t, or alt e rnatively, that 
k-has-notv-lt is important to observe that the two answers are not eqtiaHft-d^f-degfe e -el 
c e rtainty. If a security rout e r does not hav e a hash matching th e malicious packet, the -security 

how e v e rrtnea^ 
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m»tef(s)-je€ate44l«-ee4H>f>S"awayr-ftiMl so on. Th js ^^vardingmay^ 
dev4e ^s - 4¥rtMn - pui^^ 

out approach -because th e qu e ry travels a path that, e xt e nds outward from security s e rv e r 125. 
A-k^m#veiy;-aft-OHtwaf4"»»-ftpproach may be used. 

S e ew4 t y -- ^^w^^ ^ 

security neut e rs have se en th e ■ «)alicjo« S' pai-k e t|ac^"630^nd ' 625):4fa ' T e spo»s e' iadicat e s4hat 
the security router has seen the malicious packet, security server 125 associates the response and 
id e Milkat^ 

Alternatively, if t he respoas e -H^icate:! that the security router has not seen the malicious packed 
s e eurity ■ server 425 associat e s th e r e sponse and th e JD informa t ion for th e s e curity rout e r ' with 
inactive path data (act 635). 

S e euf4 t y --se i^ e r4^ 

tak e n-by-tln^^^ 

sef¥ef433Haaay^^ 

routers (acts 640 and 6 - 15 ). S e curity s e rver 125 may att e mpt to build a trac e with each receiv e d 
respometo determine the ingress point for the malicious fa^feetr4%e4«gfes»fet«f^ay-t^»^ 
where th e malicious pack e t e nter e d autonomous syst e m 1 20, public network i 50, or another 

fMths-^ a y- e me ^ -as-- a -r e stdt--of-k> s fe 
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eee*refene e« ^8£j^^ 

to -compete --large -hash -valu e s -over th e pack e ts sinc e th e chancer <?f-e^M^e»»-«6e'«»^e'-a«t«ber 
t>Pblts €6:»iftrisj-ftg the hash value decreases. Another mechanism to reduce fal s e-positives 
r^ w l teg4%om -- ee^^ 

A-fetfogr-m 
ffl e fflo» e »^f 

a -s ing l e -bit for an ob s erved pack e t; a plurality -of hash va i ti es may be comp n t e d - -each 
observed packet using several unique hash functions. This produces a corresponding number of 

rate^ the reduction in the number of hash collisions makes the tradeoff worthwhile in many 
kstanee s r For e xampl e . Bloom Filters ma y b e us e d to compute multipl e hash vahi e s-ov e i' a giv e n 

W-iK^s^uraty--se^^ 

(act 650). Security serv e r 125 may also take remedial actions (act 655). Often it will be desirable 
te-feav e -fehe participating router closest to the ingress point-«lese-e#tbe4«gpe^^ 
malicious packet. As such, s e curity server 125 may s e nd a m es sag e to th e respective 

tfafr4&e-eitl ^ 466 ^ 
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network: -R>r-e ^ n^^ 
ep e ra*kw ^- € e n^ 

EXEMPLARY PROCESSING FOR DETERMINING WHETHER A MALICIOUS PACKET 

FIG. 7 is a flowchart of e xemplary proc e ssing for d e t e rmining wh e ther a malicious Vpacket, such 
a s -a -viftB -&r-w-OftB r l>as been ohs e rv e d - aeeording to on implementation consistent with the 
prffle4p4es- ' #f{j^ 

eenfigw e d to trac e th e patfos tak e n by ma l iei ous packet*. In oth e r impl e m e ntations; on e or mere 
of the described acts may be performed by other systems or devices within system 100. 

Processing ' may begin when security router 126 receives a query from security server 125 (aet 
705). As described abov e , th e qu e ry may includ e a TTL fi e ld. A TTL field may be e mploy e d 
beeatt9e4t-pr<avMes-^*ef%^ seenr i t - y - router-FesjDO - fKis-ea l y - to 
g e levaaVorlim e ^ 
t f a v e f fr ing- t l^^ 

'wi#>-expifed-4'T-l-j-l : klds-fHay-fee discarded.- 

If the query includes a TTL field, security router 126 may determine if the TTL field in the query
has expired (act 710). If the TTL field has expired, security router 126 may discard the query
(act 715). If the TTL field has not expired, security router 126 may hash the malicious packet
contained within the query.

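Security router 126 may use the resulting hash values to address into hash memory 320. At each
of the addresses, security router 126 may determine whether indicator field 412 indicates that a
prior packet with the same hash value has been received. If none of the hash values matches,
security router 126 may send a negative response to security server 125 (act 735).

If a match is found, security router 126 may forward the query to its output ports, excluding the
output port in the direction from which the query was received (act 740). Security router 126 may
also send a positive response to security server 125, indicating that the packet has been observed
(act 745). The response may include information about packets that have passed through security
router 126.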
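
The query handling described for FIG. 7 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation described in the application: the names `SecurityRouter`, `Query`, `HASH_MEMORY_SIZE`, and `BLOCK_SIZE` are hypothetical, SHA-256 stands in for whatever hash function a real device would use, a TTL of zero or less is assumed to mean "expired", and hashing every fixed-size window of the packet is an assumed policy.

```python
import hashlib
from dataclasses import dataclass
from typing import Iterator, Optional

HASH_MEMORY_SIZE = 1 << 16  # assumed number of hash memory entries
BLOCK_SIZE = 8              # assumed hash-block length in bytes


def block_addresses(payload: bytes, block_size: int = BLOCK_SIZE) -> Iterator[int]:
    """Hash each block_size-byte window and map it to a hash memory address."""
    for offset in range(max(len(payload) - block_size + 1, 0)):
        digest = hashlib.sha256(payload[offset:offset + block_size]).digest()
        yield int.from_bytes(digest[:4], "big") % HASH_MEMORY_SIZE


@dataclass
class Query:
    payload: bytes             # copy of the suspected malicious packet
    ttl: Optional[int] = None  # optional TTL field (assumed: <= 0 means expired)


class SecurityRouter:
    def __init__(self) -> None:
        # One indicator per address: 1 means a packet with this hash was seen.
        self.indicator = bytearray(HASH_MEMORY_SIZE)

    def observe(self, payload: bytes) -> None:
        """Record the hash values of a packet that passed through the router."""
        for address in block_addresses(payload):
            self.indicator[address] = 1

    def handle_query(self, query: Query) -> str:
        """Return 'discard', 'positive', or 'negative' for a received query."""
        if query.ttl is not None and query.ttl <= 0:
            return "discard"   # expired TTL: drop the query
        if any(self.indicator[a] for a in block_addresses(query.payload)):
            return "positive"  # a matching hash value was recorded earlier
        return "negative"      # no hash value matched
```

Because only indicator bits are stored, a positive answer means "probably observed": hash collisions can produce false positives, which is the tradeoff the hash-memory design accepts.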

CONCLUSION 

Systems and methods consistent with the present invention provide mechanisms within an e-mail
server to detect and/or prevent transmission of unwanted e-mail, such as e-mail containing
viruses or worms, including polymorphic viruses and worms, and unsolicited commercial e-mail
(spam).

[0077] Implementation of a hash-based detection mechanism in an e-mail server at the e-mail
message level offers advantages over a packet-based implementation in a router or other
network node device. For example, the entire e-mail message has been re-assembled, both at the
packet level (i.e., IP fragment re-assembly) and at the application level (multiple packets into a
complete e-mail message). Also, the hashing algorithm can be applied intelligently to
specific parts of the e-mail message (e.g., header fields, message body, and attachments).
Attachments that have been compressed for transport (e.g., .zip files) can be expanded for
inspection. Without doing so, a polymorphic virus could hide inside such files with no
repeatable hash signature visible at the transport level.

[0078] With the entire message available to the hashing process, packet boundaries and packet
fragmentation do not split sequences of bytes that might otherwise provide useful hash
signatures. An attacker might otherwise defeat detection of a virus or worm
by causing the IP packets carrying the malicious code to be fragmented into pieces smaller than
that for which the hashing process is effective, or by forcing packet breaks in the middle of
otherwise-visible fixed sequences of code in the virus or worm. Also, the entire message is likely
to be longer than a single packet, thereby reducing the probability of false alarms (possibly due
to insufficient hash-block sample size and too few hash blocks per packet) and increasing the
probability of correct identification (more hash blocks will match per message than per packet,
since packets are only parts of the entire message).

Also, fewer hash-block alignment issues arise when the hash blocks can be
intelligently aligned with fields of the e-mail message, such as the start of the message body, or
the start of an attachment block. This results in faster detection of duplicate contents than if the
blocks are randomly aligned (as is the case when the method is applied to individual packets).
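
As a rough illustration of aligning hash blocks with message fields, the following Python sketch parses a raw message with the standard `email` package and starts a fresh run of fixed-size hash blocks at the beginning of each body or attachment part. The function names, the 64-byte `BLOCK_SIZE`, and the use of SHA-256 are assumptions made for the example, not details taken from the application.

```python
import hashlib
from email import message_from_bytes
from typing import Dict, List

BLOCK_SIZE = 64  # assumed hash-block length in bytes


def aligned_block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> List[str]:
    """Hash consecutive blocks, with the first block starting at offset 0."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def message_block_hashes(raw: bytes) -> Dict[str, List[str]]:
    """Hash each leaf part of a message, aligned to the start of that part."""
    message = message_from_bytes(raw)
    result: Dict[str, List[str]] = {}
    for index, part in enumerate(message.walk()):
        if part.is_multipart():
            continue  # containers hold no content of their own
        # decode=True undoes base64 / quoted-printable transfer encoding.
        payload = part.get_payload(decode=True) or b""
        kind = "attachment" if part.get_filename() else "body"
        result[f"{kind}-{index}"] = aligned_block_hashes(payload)
    return result
```

Two messages that carry the same body under different headers yield identical body hashes with this approach, whereas hashing the raw byte stream in fixed blocks from offset 0 gives different results as soon as the header lengths or contents differ.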

[0080] E-mail-borne malicious code, such as viruses and worms, also usually includes a text
message designed to cause the user to read the message and/or perform some other action that
will activate the malicious code. It is harder for such text to be polymorphic, because automatic
scrambling of the user-visible text will either render it suspicious-looking, or will be very limited
in variability. This fact, combined with the ability to start a hash block at the start of the
message text by parsing the e-mail header, reduces the variability in hash signatures of the
message, making it easier to detect with fewer examples seen.

[0081] Further, the ability to extend the hashing to specific headers of an e-mail message
can be used to help classify the type of replicated content the message body carries.
Because many legitimate cases of message replication exist (e.g., topical mailing lists, such as
Yahoo Groups), intelligent parsing and hashing of the message headers is very useful to reduce
the false alarm rate, and to increase the probability of detection of viruses, worms, and spam.

[0082] This detection technique, compared to others which might extract and save fixed
strings to be searched for in other e-mail, uses hash-based filters that are one-way
functions (i.e., it is possible, given a piece of text, to determine whether it has been seen before in
another message; it is virtually impossible, however, given the state of the filter, to
reconstruct any piece of a prior message that passed through the
filter previously). Thus, this technique can maintain the privacy of e-mail, without retaining any
information that can be attributed to a specific sender or receiver.
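
The one-way property can be illustrated with a small Bloom-filter-style structure. This is a sketch under assumptions rather than the filter the application describes: the class name `OneWayFilter`, the filter size, the number of hash functions, and the use of salted SHA-256 digests are all chosen for the example.

```python
import hashlib
from typing import Iterator


class OneWayFilter:
    """Bit array that remembers which blocks were seen, but not their content."""

    def __init__(self, size_bits: int = 1 << 20, num_hashes: int = 3) -> None:
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)  # the only state that is kept

    def _positions(self, block: bytes) -> Iterator[int]:
        # Derive num_hashes bit positions by salting the block with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + block).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, block: bytes) -> None:
        """Set the bits for this block; the block itself is not stored."""
        for position in self._positions(block):
            self.bits[position // 8] |= 1 << (position % 8)

    def seen(self, block: bytes) -> bool:
        """True if every bit for this block is set (false positives possible)."""
        return all(
            self.bits[position // 8] & (1 << (position % 8))
            for position in self._positions(block)
        )
```

Given a piece of text, `seen` answers whether it probably appeared before, yet the bit array contains no message text, sender, or receiver, so the filter state alone cannot be turned back into prior messages.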
[0083] The foregoing description of preferred embodiments of the present invention provides
illustration and description, but is not intended to be exhaustive or to limit the invention to the 
precise form disclosed. Modifications and variations are possible in light of the above teachings 
or may be acquired from practice of the invention. 

For example, systems and methods have been described with regard to a mail server. In other
implementations, the systems and methods described herein may be used
within other devices, such as a mail client. In such a case, the mail client may periodically
obtain suspicion count values for its hash memory from one or more network devices, such as a
mail server.

While a series of acts has been described, the order of the acts may differ in other
implementations. In addition, non-dependent acts may be performed concurrently.

Further, certain portions of the invention have been described as "logic" that performs one or
more functions.

[0084] It may be possible for multiple mail servers to work together to detect and prevent
unwanted e-mails. For example, information gathered by some mail servers might be distributed
to other mail servers, which may accelerate the detection process, especially for mail servers
that experience relatively low traffic.
Further, certain portions of the invention have been described as "blocks" that perform one or
more functions. These blocks may include hardware, such as an ASIC or an FPGA, software, or a
combination of hardware and software.

No element, act, or instruction used in the description of the present application should be
construed as critical or essential to the invention unless explicitly described as such. Also, as
used herein, the article "a" is intended to include one or more items. Where only one item is
intended, the term "one" or similar language is used. The scope of the invention is defined by the
claims and their equivalents.